Mixture of experts regression modeling by deterministic annealing
Abstract
We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) approaches, we directly minimize the (squared) regression error. We use the probabilistic framework as a means to define an optimization method that avoids many shallow local minima on the complex cost surface. Our method is based on deterministic annealing (DA), where the entropy of the system is gradually reduced, with the expected regression cost (energy) minimized at each entropy level. The corresponding Lagrangian is the system’s “free energy,” and this annealing process is controlled by variation of the Lagrange multiplier, which acts as a “temperature” parameter. The new method consistently and substantially outperformed the competing methods for training NRBF and HME regression functions over a variety of benchmark regression examples.
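The abstract gives no formulas, so the following is only a toy sketch of the free-energy idea it describes, not the paper's algorithm. It assumes a single sample with fixed, hypothetical squared errors for three local experts, an unconstrained Gibbs distribution over experts, and a geometric cooling schedule; all function names are illustrative. The free energy is taken as F = D - T·H, where D is the expected cost and H the entropy of the assignment probabilities.

```python
import math

def gibbs_assignment(errors, temperature):
    """Assignment probabilities p_j proportional to exp(-errors[j] / temperature)."""
    shift = min(errors)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - shift) / temperature) for e in errors]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_energy(errors, probs, temperature):
    """F = D - T * H, with D the expected cost under probs."""
    expected_cost = sum(p * e for p, e in zip(probs, errors))
    return expected_cost - temperature * entropy(probs)

# Hypothetical squared errors of three local experts on one sample (assumption).
errors = [0.9, 0.1, 0.4]

temperature = 10.0
while temperature > 1e-3:
    probs = gibbs_assignment(errors, temperature)
    print(f"T={temperature:9.4f}  H={entropy(probs):.4f}  "
          f"F={free_energy(errors, probs, temperature):.4f}")
    temperature *= 0.5  # geometric cooling schedule (assumption)
```

At high temperature the assignment is nearly uniform (high entropy); as the temperature falls, the probability mass concentrates on the lowest-error expert and the free energy approaches the minimum error. In the method the abstract describes, the model parameters would also be re-optimized at each temperature, which this sketch omits.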
Journal title:
- IEEE Trans. Signal Processing
Volume 45, Issue -
Pages -
Publication date: 1997